Code positioning to reduce instruction cache misses in signal processing applications on multimedia RISC processors
Authors
Abstract
Real-time operation of signal processing applications on multimedia RISC processors is often limited by the high instruction cache miss rates of direct-mapped caches. In this paper, a heuristic approach is presented which reduces high instruction cache miss rates in direct-mapped caches by code positioning. The proposed algorithm rearranges functions in memory based on trace data so as to minimize cache line conflicts. Moreover, a new method to extract potential cache misses from trace data is introduced which enables accurate cache behavior analysis and greatly enhances code positioning efficiency. Application of code positioning to an MPEG-1 video decoder implementation on the V830 multimedia RISC processor reduced instruction cache refill cycles by 66-98%. The proposed code positioning algorithm does not require hardware modifications; it can easily be integrated into an object linker to automate the optimization process.
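To illustrate why function placement matters in a direct-mapped cache, the following is a minimal sketch, not the algorithm proposed in the paper. It simulates a direct-mapped instruction cache over a function-level trace and compares two layouts. The cache geometry (32-byte lines, 128 lines), the function names `idct` and `vld`, their sizes, and the trace are all hypothetical values chosen for illustration, and each call is simplified to fetching every line of the function once.

```python
LINE_SIZE = 32    # bytes per cache line (assumed, not taken from the paper)
NUM_LINES = 128   # lines in the direct-mapped cache, i.e. 4 KiB (assumed)


def lines_of(start, size):
    """Return the memory line numbers covered by a function at [start, start + size)."""
    first = start // LINE_SIZE
    last = (start + size - 1) // LINE_SIZE
    return range(first, last + 1)


def count_misses(layout, sizes, trace):
    """Count instruction cache misses for a function-level trace.

    layout: function name -> start address in memory
    sizes:  function name -> code size in bytes
    trace:  function names in execution order
    Simplification: each call fetches every line of the function once.
    """
    cache = [None] * NUM_LINES  # cache[index] holds the memory line stored there
    misses = 0
    for name in trace:
        for line in lines_of(layout[name], sizes[name]):
            index = line % NUM_LINES  # direct-mapped: exactly one possible slot
            if cache[index] != line:
                cache[index] = line   # refill evicts whatever was there
                misses += 1
    return misses


# Hypothetical functions that call each other alternately.
sizes = {"idct": 1024, "vld": 1024}
trace = ["idct", "vld"] * 100

# Layout A: start addresses differ by the cache size, so both map to the same slots.
conflicting = {"idct": 0, "vld": 4096}
# Layout B: vld placed directly after idct, so the two occupy disjoint slots.
adjacent = {"idct": 0, "vld": 1024}

misses_conflicting = count_misses(conflicting, sizes, trace)
misses_adjacent = count_misses(adjacent, sizes, trace)
print(misses_conflicting, misses_adjacent)
```

In this toy setup the conflicting layout evicts every line on every call, while the adjacent layout only pays the initial cold misses. A trace-driven positioning heuristic would search for layouts of the second kind across many functions.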
Similar resources
Evaluation of architectural support for speech codecs application in large-scale parallel machines
Next generation multimedia mobile phones that use the high bandwidth 3G cellular radio network consume more power. Multimedia algorithms such as speech and video transcodecs have very large instruction footprints and are consequently stalled due to instruction cache misses. The conflicts in on-chip caches contribute a large fraction of the CPU cycle penalty and hence an increase in power consumption. Ma...
Temporal Distribution Based Software Cache Partition To Reduce I-cache Misses
As multimedia applications on mobile devices become more computationally demanding, embedded processors with one level I-cache become more prevalent, typically with a combined I-cache and SRAM of 32KB ~ 48KB total size. Code size reduction alone is no longer adequate for such applications since program sizes are much larger than the SRAM and I-cache combined. For such systems, a 3% I-cache miss...
Improving Memory-System Performance of Sparse Matrix-Vector Multiplication
Sparse matrix-vector multiplication is an important kernel that often runs inefficiently on superscalar RISC processors. This paper describes techniques that increase instruction-level parallelism and improve performance. The techniques include reordering to reduce cache misses originally due to Das et al., blocking to reduce load instructions, and prefetching to prevent multiple load-store uni...
Code Positioning for VLIW Architectures
Several studies have considered reducing instruction cache misses and branch penalty stall cycles by means of various forms of code placement. Most proposed approaches rearrange procedures or basic blocks in order to speed up execution on sequential architectures with branch prediction. Moreover, most works focus mainly on instruction cache performance and disregard execution cycles. To the bes...
Multimedia Processors - Proceedings of the IEEE
This paper describes recent large-scale-integration programmable processors designed for multimedia processing such as real-time compression and decompression of audio and video as well as the generation of computer graphics. As the target of these processors is to handle audio and video in real time, the processing capability must be increased tenfold compared to that of conventional microproc...